CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration

Abstract

In the absence of assigned tasks, a learning agent typically seeks to explore its environment efficiently. However, the pursuit of exploration brings more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe and efficient exploration when the task is unknown. In this paper, we propose a practical Constrained Entropy Maximization (CEM) algorithm to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs. The CEM algorithm aims to learn a policy that maximizes state entropy under the premise of safety. To avoid approximating the state density in complex domains, CEM leverages a k-nearest neighbor entropy estimator to evaluate the efficiency of exploration. In terms of safety, CEM minimizes the safety costs and adaptively trades off safety and exploration based on the current constraint satisfaction. The empirical analysis shows that CEM enables the acquisition of a safe exploration policy in complex environments, resulting in improved performance in both safety and sample efficiency for target tasks.
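The abstract does not spell out the k-nearest neighbor entropy estimator, so the following is only a minimal sketch of one standard choice, the Kozachenko-Leonenko estimator, and not necessarily the variant used in CEM. The function name `knn_state_entropy`, the default `k = 3`, and the brute-force distance computation are all assumptions made for illustration.

```python
import math

import numpy as np


def knn_state_entropy(states: np.ndarray, k: int = 3) -> float:
    """Kozachenko-Leonenko style k-NN estimate of differential entropy (in nats).

    `states` is an (n, d) array of visited states. Hypothetical helper:
    the estimator actually used by CEM may differ in its details.
    """
    n, d = states.shape
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")

    # Brute-force pairwise Euclidean distances, shape (n, n).
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dists, axis=1)[:, k]
    kth = np.maximum(kth, 1e-12)  # guard against log(0) for duplicate states

    # Log-volume of the d-dimensional unit ball.
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)

    # For integers, digamma(n) - digamma(k) equals sum_{i=k}^{n-1} 1/i.
    digamma_diff = sum(1.0 / i for i in range(k, n))

    return digamma_diff + log_unit_ball + d * float(np.mean(np.log(kth)))
```

States that are spread widely yield a larger estimate than states clustered in a small region, which is the property that makes such an estimate usable as an exploration signal without fitting an explicit state density. The brute-force distance matrix costs O(n^2) memory, so a spatial index would be preferable for large batches.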

Similar resources

Conditional entropy maximization for PET

Maximum Likelihood (ML) estimation is extensively used for estimating emission densities from clumped and incomplete measurement data in the Positron Emission Tomography (PET) modality. The reconstruction produced by the ML algorithm has been found to be noisy because it does not make use of available prior knowledge. Bayesian estimation provides such a platform for the inclusion of prior knowledge in the recons...

Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm

We present the CEM (Conditional Expectation Maximization) algorithm as an extension of the EM (Expectation Maximization) algorithm to conditional density estimation under missing data. A bounding and maximization process is given to specifically optimize conditional likelihood instead of the usual joint likelihood. We apply the method to conditioned mixture models and use bounding techniques to ...

Safe exploration for reinforcement learning

In this paper we define and address the problem of safe exploration in the context of reinforcement learning. Our notion of safety is concerned with states or transitions that can lead to damage and thus must be avoided. We introduce the concepts of a safety function for determining a state’s safety degree and that of a backup policy that is able to lead the controlled system from a critical st...

Constrained Utility Maximization for Generating Visual Skims

In this paper, we present a novel algorithm to generate visual skims, that do not contain audio, from computable scenes. Visual skims are useful for browsing digital libraries, and for on-demand summaries in set-top boxes. A computable scene is a chunk of data that exhibits consistencies with respect to chromaticity, lighting and sound. First, we define visual complexity of a shot to be its Kol...

Interleaved Algorithms for Constrained Submodular Function Maximization

We present a combinatorial algorithm that improves the best known approximation ratio for monotone submodular maximization under a knapsack and a matroid constraint to $\frac{1 - e^{-2}}{2}$. This classic problem is known to be hard to approximate within a factor better than $1 - 1/e$. We show that the algorithm can be extended to yield a ratio of $\frac{1 - e^{-(k+1)}}{k+1}$ for the problem with a single knapsack and the int...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i9.26281